ABSTRACT



The general purpose of this research project was to 
discover those personality characteristics which differentiate 
college students who tend to learn more effectively from one 
instructional format than from another. Two college courses were 
studied concurrently and four different teaching conditions were 
utilized in each course. A comprehensive battery of personality 
inventories was administered to each of the students, and three types 
of criterion measures were collected in both courses. Chapter 1 
presents the problem. Chapter 2 discusses the methodology of the 
project and details the procedures used in the two experimental 
courses. Chapter 3 focuses on the main effects: those due to 
treatment variables (i.e., the relationships between the 
instructional conditions and the course outcomes) and those arising
from the personality variables (i.e., the relationships between scale
scores and the criterion measures). Chapter 4 presents the major
trait-by-treatment interactions based upon the a priori personality
scales. Chapter 5 describes the construction of new empirical
interaction scales and presents the results using this strategy of
scale construction. Chapter 6 reviews and discusses the major 
findings, and Chapter 7 summarizes the report. (Author/A?) 
Preface



The goal of this research project was to discover those personality charac- 
teristics of college students which predispose them towards learning more effec- 
tively from one — rather than some other — particular instructional format. The 
program is predicated upon an assumption that no single college instructional 
procedure will be best for all students, but rather that there is an interaction 
between the personality of the student and the optimal method of teaching him. 

The present project serves to expand our knowledge of this interactive process
by examining the characteristics of students which influence their relative
performance in different instructional methods. The findings from this project, if
replicated in other college courses, could have important implications for basic
knowledge of critical personality differences among college students, and for
applied practices aimed at grouping students into more homogeneous classes, each
of which might profitably be taught by some different instructional procedure.

Approximately 900 students, drawn from two college courses, were each taught by one
of four different instructional formats, two of which lie near the opposite poles
of the general dimension of "degree of course structure." Most of these students
completed an extensive battery of personality measures which yielded over 350
test scores for each individual. Three broad classes of criterion information
were assessed from each student in each of the two courses: (a) knowledge of
course content, as measured by two comprehensive examinations (one of which
included both an essay and a multiple-choice portion), (b) the amount of
course-related but non-graded reading each student carried out during the course, and
(c) the degree of student satisfaction with the course. This Report is focused
upon the relationships between the student personality characteristics and these
criterion measures among those students in each of the differing instructional
formats. These interactive relations were explored both through the analysis of
existing (a priori) personality scales, and through the development of new
empirical interaction scales.
Chapter I
THE PROBLEM 

Over the years, in a continued effort to improve the practice of higher
education, a number of investigators have attempted to assess the differential
effects of various teaching procedures upon student achievement in college courses.
The instructional methods which have been compared in studies of this sort can be
divided into at least two major types: (a) variations in teaching techniques or
"instructor input" and (b) variations in mode of performance or "student output."

Examples of research on the effects of different instructor inputs include
comparisons between large vs. small (e.g., Goldberg, 1964), required vs. elective
(e.g., Goldberg, 1964), or homogeneous vs. heterogeneous (e.g., Longstaff, 1932)
classes; lectures vs. group discussion (e.g., Guetzkow, Kelly, & McKeachie, 1954;
Hurst, 1963); lectures vs. independent study or self-study (e.g., Koenig & McKeachie,
1959; Ulrich & Pray, 1965); face-to-face vs. televised instruction (e.g., Gulo &
Nigro, 1966; Husband, 1954); textbook vs. programmed reading (e.g., Goldberg,
Dawson, & Barrett, 1964; McGrew, Marcia, & Wright, 1966; Rawls, Perry, & Timmons,
1966; Ripple, 1953; Young, 1967); and variations among "teaching styles" (e.g.,
Coats & Smidchens, 1966; Haines & McKeachie, 1967; McKeachie, 1954, 1958, 1968;
McKeachie, Lin, Milholland, & Isaacson, 1968), or grading policies (e.g., Goldberg,
1965), or feedback methods (e.g., Anderson, White, & Wash, 1966; Sassenrath &
Garverick, 1965). Examples of research on the effects of different student outputs
include such comparisons as those between quiz and essay examinations (e.g.,
Guetzkow, Kelly, & McKeachie, 1954), and among various frequencies of quizzes
(e.g., Fitch, Drucker, & Horton, 1951; Longstaff, 1932).

Educational research of both types has been reviewed by Wolfle in 1942,
Good in 1952, and later by McKeachie (1961, 1962, 1963). An excellent summary
of research on the comparative effectiveness of various teaching procedures has 
recently been published (Dubin and Taveggia, 1968), and consequently these 
studies will not be reviewed again here. With relatively few exceptions, the 
overwhelming finding that has emerged from the hundreds of studies of both kinds 
is that differing college instructional procedures do not appear to produce any 
consistent differences in average course achievement. 

At least three hypotheses have been proposed to account for this general 
finding. In the first place, it may be that most of the failures to find 
differences between teaching conditions have foundered on the shoals of crude 
criterion measures. Perhaps all instructional techniques differentially affect 
students to some degree, but present instruments simply are not sensitive enough 
to detect these differences. For example, critics of studies comparing televised 
with face-to-face instruction have attempted to minimize the evidence that 
televised instruction appears to produce no more learning than traditional 
instruction by suggesting that tests tapping visual content would demonstrate such 
a superiority. While it is reasonable to assume that most measures of academic 
achievement could be improved, nonetheless when one considers the special attention
given to criterion measurement in a host of previous studies (e.g., Guetzkow,
Kelly, & McKeachie, 1954), it is doubtful whether faulty criteria per se can be
blamed for most of the negative findings. 

A second hypothesis which could account for the lack of differences
between instructional techniques points an accusing finger at the methods
themselves. Just as extremist political groups have accused Republicans and
Democrats of providing the voter with "no real choice," so some critics of
past educational research have deplored the lack of imagination of college
instructors in finding any radically different types of instructional formats.
While college professors are increasingly being viewed as "traditional" and
"conservative" (in practice if not in ideology), is it reasonable to suppose
that such diverse instructional procedures as lectures, programmed textbooks,
drill instruction, telecourses, group discussions, and independent study offer
no real choice?

A third explanation for the failure to find significant differences among 
teaching methods stems from the belief that neither the instruments nor the
teaching procedures are at fault, but rather that college instruction is a 
more complicated research area than had initially been assumed. The heart of 
the third hypothesis lies in the assumption that there is an interaction 
between teaching methods and characteristics of the learner, and that the 
techniques which are the best for some students may be the worst for others. 
McKeachie, for example, has written:

"One possible partial explanation for the meager findings . . . is
that teaching methods affect different students differently. Students
who profit from one method may do poorly in another, while other
students may do poorly in the first method and well in the second.
When we average them together we find little overall difference between
methods . . ." (McKeachie, 1961; pp. 111-112).

"Our concern that opportunities for individualized instruction 
be protected is related to an awareness that differences between 
students are inadequately cared for by our usual teaching methods.
Experienced teachers have felt for years that no single teaching 
method succeeds with all kinds of students. It is possible that one 
of the reasons for the host of experimental comparisons resulting in 
nonsignificant differences is simply that methods optimal for some 
students are detrimental to the achievement of others. When mean 
scores are compared, one method thus seems to be no different in its 
effect from any others" (McKeachie, 1962; p. 351).
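
The averaging argument in these two passages can be made concrete with a small numerical sketch. The scores, the two student types, and the two method labels below are hypothetical illustrations, not data from this project; they are chosen only so that each method is best for one kind of student and worst for the other.

```python
from statistics import mean

# Hypothetical achievement scores (NOT data from this report).
# Keys are (student type, teaching method).
scores = {
    ("type_1", "method_A"): [82, 78, 80],
    ("type_1", "method_B"): [62, 58, 60],
    ("type_2", "method_A"): [61, 59, 60],
    ("type_2", "method_B"): [79, 81, 80],
}


def method_mean(method):
    """Average score for one method, pooling over both student types."""
    pooled = [s for (_, m), vals in scores.items() if m == method for s in vals]
    return mean(pooled)


def cell_mean(student_type, method):
    """Average score for one student type under one method."""
    return mean(scores[(student_type, method)])


def interaction_contrast():
    """(A minus B) for type_1 students, minus (A minus B) for type_2 students."""
    d1 = cell_mean("type_1", "method_A") - cell_mean("type_1", "method_B")
    d2 = cell_mean("type_2", "method_A") - cell_mean("type_2", "method_B")
    return d1 - d2


print(method_mean("method_A"), method_mean("method_B"))  # identical method means
print(interaction_contrast())                            # large interaction contrast
```

In this constructed case the two methods have the same overall mean, so a comparison of mean scores would report no difference, while the contrast between student types is large.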

The crux of this third hypothesis lies in the concept of a "trait-by-treatment
interaction" in all human affairs, and all psychological research.
This concept has begun to gain some currency through the thoughtful and lucid 
exposition by Lee Cronbach (1957) in his A.P.A. presidential address and the 
related monograph by Cronbach and Gleser (1957, 1965) on the application of 
decision-theoretic models to problems of personnel classification. As 
Cronbach has written: 

"My argument rests on the assumption that such aptitude-treatment 
interactions exist. There is, scattered in the literature, a remarkable 
amount of evidence of significant, predictable differences in the way 
people learn. We have only limited success in predicting which of two 
tasks a person can perform better, when we allow enough training to 
compensate for differences in past attainment. But we do find that a 
person learns more easily from one method than another, that this best 
method differs from person to person, and that such between-treatments 
differences are correlated with tests of ability and personality" 
(Cronbach, 1957; p. 681). 




A more recent explication of this position can be found in a chapter
entitled "How can instruction be adapted to individual differences?" (Cronbach,
1967) in a book on "Learning and Individual Differences" (Gagne, 1967),
a volume which may owe its very existence to Cronbach's previous arguments.

Pervin (1968) has recently reviewed the experimental literature on
trait-by-treatment interaction, or in his words on "individual-environment fit."
Pervin "assumes that for each individual there are environments (interpersonal
and noninterpersonal) which more or less match the characteristics of
his personality. A 'match' or 'best-fit' . . . of individual to environment
is viewed as expressing itself in high performance, satisfaction, and little
stress on the system whereas a 'lack of fit' is viewed as resulting in decreased
performance, dissatisfaction, and stress in the system" (Pervin, 1968;
p. 56).

One concrete example may help to clarify the nature of such potential
interactions. Kagan (1967) has recently reported the following study:

"The hypothesis can be simply stated. An individual will attend 
more closely to an initial stranger with whom he feels he shares 
attributes than to a stranger with whom he feels he does not share 
attributes, other things [being] equal. . . . The subjects in this study were 
56 Radcliffe freshmen and sophomores preselected for the following pair 
of traits. One group, the academics, were rated by four judges — all 
roommates — as being intensely involved in studies much more than they 
were in dating, clubs, or social activities. The second group, the 
social types, were rated as being much more involved in dating and 
social activities than they were in courses or grades. No subject 
was admitted into the study unless all four judges agreed that she fit
one of these groups.

"Each subject was seen individually by a Radcliffe senior, and
told that each was participating in a study of creativity. The subject 
was told that Radcliffe seniors had written poems and that two of the 
poets were selected by the Harvard faculty as being the best candidates. 
The faculty could not decide which girl was the more creative and the 
student was going to be asked to judge the creativity of each of two 
poems that the girls had written. The subjects were told that creati- 
vity is independent of IQ for bright people and they were told that 
since the faculty knew the personality traits of the girls, the student 
would be given that information also. The experimenter then described 
one of the poets as an academic grind and the other as a social activist. 
Each subject listened to two different girls recite two different poems 
on a tape. Order of presentation and voice of the reader were counter- 
balanced in an appropriate design. After the two poems were read the 
subject was asked for a verbatim recall of each poem. . . . The academic 
subjects recalled more of the poem when it was read by the academic model 
than by the social model; whereas, the social subjects recalled more of 
the poem when it was read by the social model than the academic model.
. . . Distinctiveness of tutor is enhanced by a perceived relation
between learner and tutor" (Kagan, 1967; pp. 139-140).

For other illustrations of such trait-by-treatment interaction effects, see
Carney (1966); Carson, Harden, & Shows (1964); Colquhoun & Corcoran (1964);
Hoehn & Saltz (1956); Klett & Moseley (1965); Megargee, Bogart, & Anderson
(1966); and Paul & Erickson (1964).




Studies of the interaction hypothesis within the context of college
instruction date back at least a decade or two (e.g., Wispe, 1951), although
only recently have there been any concerted efforts to explore the hypothesis
in a systematic manner. The research programs of the Siegels at
Miami University (e.g., Siegel & Siegel, 1964, 1965, 1966, 1967) and
McKeachie and his associates at the University of Michigan (e.g., Koenig &
McKeachie, 1959; McKeachie, 1958, 1961, 1968; McKeachie, Isaacson, Milholland,
& Lin, 1968; McKeachie, Lin, Milholland, & Isaacson, 1966) are based on this
hypothesis, as are a number of single studies by other investigators (e.g.,
Beach, 1960; Denny, Paterson, & Feldhusen, 1964; Heath, 1964; Lublin, 1965;
Smith, Wood, Downer, & Raygor, 1965; Snow, Tiffin, & Seibert, 1965). A few
investigators have explored this hypothesis among high school or junior high
school students (e.g., Osburn & Melton, 1963; Ripple, Glock, & Millman, 1967)
and military personnel (e.g., Tallmadge, 1968; Tallmadge, Shearer, Greenberg,
& Chalupsky, 1968). Reviews of the literature on the interaction hypothesis
in college instruction can be found in McKeachie (1962, 1963, 1968), and thus
these studies need not be summarized again here.

Unfortunately, most of these efforts to demonstrate trait by teaching
method interaction effects have not been very successful. While a number of
significant interactions have occurred in isolated investigations (e.g., Beach,
1960; Domino, 1958; Heath, 1964; Paul & Ericksen, 1964; Snow, Tiffin, & Seibert, 1965;
Tallmadge, Shearer, Greenberg, & Chalupsky, 1968), they have yet to be replicated.
The few attempts at replication of previous interactions have been,
by and large, somewhat discouraging (e.g., Gruber & Weitman, 1962; Koenig &
McKeachie, 1959; McKeachie, 1958, 1961, 1963; McKeachie, Lin, Milholland, &
Isaacson, 1966; Siegel & Siegel, 1964, 1965, 1966). In addition, quite a number
of published studies (not to mention the hidden mass of unpublished ones)
sought, but did not find, any significant trait by method interactions at
all (e.g., Anderson, White, & Wash, 1966; Goldberg, 1964, 1965; Goldberg,
Dawson, & Barrett, 1964; Guetzkow, Kelly, & McKeachie, 1954; Lublin, 1965;
Ripple, Glock, & Millman, 1967; Sassenrath & Garverick, 1965; Tallmadge, 1968).

Why has so appealing an hypothesis borne such fragile fruit? First of 
all, it is important to recognize the sheer statistical problems associated 
with the demonstration of a significant interaction, since the classic general 
linear model first attempts to express all of the covariance in terms of main 
effects and uses only the residual covariance for tests of interaction effects 
(Cohen, 1968; Goldberg, 1968; Hoffman, 1968; Hoffman, Slovic, & Rorer,
1968). As Rorer (1967) and Yntema & Torgerson (1961) have demonstrated, there
is a large class of interactive processes which will produce observations 
quite easily predictable by a linear additive model (i.e., the main effects
alone). In the use of linear regression or analysis of variance techniques,
a non-significant interaction term is no guarantee that the underlying process 
is not an interactive one. Clearly, if we wish to take the interaction 
hypothesis seriously, we must find some new means of testing for interaction 
effects, though this may well violate, in some sense, both the "law of parsimony"
and the "law of conventional significance testing."
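
The statistical point can be illustrated with a minimal sketch. It is not a reproduction of the demonstrations by Rorer or by Yntema and Torgerson; it assumes a hypothetical, purely multiplicative process (y = a * b) observed on a balanced grid of two five-level factors, fits the main effects only, and reports how much of the variance those main effects absorb.

```python
def additive_fit_r2(levels):
    """Proportion of variance in y = a * b explained by main effects alone."""
    # Hypothetical, purely interactive (multiplicative) process on a balanced grid.
    y = {(a, b): a * b for a in levels for b in levels}
    n = len(levels)

    grand = sum(y.values()) / (n * n)
    row = {a: sum(y[(a, b)] for b in levels) / n for a in levels}
    col = {b: sum(y[(a, b)] for a in levels) / n for b in levels}

    # Main-effects-only prediction for each cell (no interaction term).
    pred = {(a, b): row[a] + col[b] - grand for (a, b) in y}

    ss_total = sum((v - grand) ** 2 for v in y.values())
    ss_resid = sum((y[k] - pred[k]) ** 2 for k in y)
    return 1 - ss_resid / ss_total


r2 = additive_fit_r2([1, 2, 3, 4, 5])
print(round(r2, 3))  # 0.9
```

Under these assumptions the additive model accounts for 90 percent of the variance even though the generating process contains nothing but an interaction, leaving only the remaining 10 percent available for any test of the interaction term.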

However, there is another — and even more serious — reason why past efforts 
to demonstrate stable trait by teaching method interactions have typically 
failed. Again Cronbach has provided the key:

"Applied psychologists should deal with treatments and persons
simultaneously. Treatments are characterized by many dimensions; so
are persons. The two sets of dimensions together determine a payoff
surface. For any practical problem, there is some best group of
treatments to use and some best allocation of persons to treatments.
We can expect some attributes of persons to have strong interactions
with treatment variables. These attributes have far greater practical
importance than the attributes which have little or no interaction.
In dividing pupils between college preparatory and non-college studies,
for example, a general intelligence test is probably the wrong thing
to use. This test, being general, predicts success in all subjects,
therefore tends to have little interaction with treatment, and if so
is not the best guide to differential treatment. We require a measure
of aptitude which predicts who will learn better from one curriculum
than from the other; but this aptitude remains to be discovered.
Ultimately we should design treatments, not to fit the average person,
but to fit groups of students with particular aptitude patterns. Conversely,
we should seek out the aptitudes which correspond to (interact
with) modifiable aspects of the treatment" (Cronbach, 1957; pp. 680-681).
[Italics added.]

In the above paragraph, Cronbach has made two important points: (a)
that individuals (and treatments) must be conceptualized in a multivariate
paradigm (e.g., Cattell, 1957; Siegel & Siegel, 1967), and (b) that those
individual difference measures which have gained the widest currency as general 
predictors are the least likely candidates for being good differential (or 
interaction) ones. What is needed, therefore, is an extensive search for 
precisely those measures which, while not showing great promise as general 
predictors, turn out to be consistently associated with interaction effects. 

Yet, virtually all previous studies of trait by teaching method interactions
have utilized only a few personality measures, and these typically
have been selected because of their easy availability (e.g., sex) and/or
because of their previously demonstrated value as general predictors (e.g.,
G.P.A., scholastic aptitude, anxiety, sociability). For example, in a 
systematic research program on college instruction which is explicitly both 
multidimensional and interaction-focused, Siegel and Siegel (1964, 1965, 1966, 
1967) have typically utilized only three to five personality measures (each a 
dichotomized variable) — at least two of which (scholastic ability and prior 
knowledge of course content) are among the sort of general predictors rather 
unlikely to serve much of an interactive function. And, in the other large- 
scale research project on the interaction hypothesis, McKeachie and his 
associates have typically utilized an equally small set of personality 
measures, primarily the projective-based (and notoriously unreliable) scores
for need Achievement, need Power, and need Affiliation, plus once again two 
general predictors (scholastic aptitude and test anxiety) — all five being 
rather unlikely candidates for an interaction role. 

While the directors of both research programs might argue that the
personality measures they utilize are "theory-based," stemming on the one
hand from a general theory of instruction (Siegel & Siegel) and on the other
from a general theory of motivation (Atkinson & McKeachie), it is doubtful
whether either "theory" actually dictated these measurement decisions. For,
at the moment, we have few theories in psychology, and none in college
instruction, which specify the number and nature of those personality
characteristics predisposing students to achieve differentially in different college
courses (see Bruner, 1961, 1966; Jones, 1968; Siegel, 1967; Skinner, 1968).

What is needed for the development of such a theory is a broad band-width
assessment of college students who are randomly assigned to at least two rather
diverse instructional formats. If a comprehensive set of present-day
psychometric measures is tried, some may turn out to be useful interaction
variables. Or, if the techniques now extant to construct such instruments
implicitly guarantee their uselessness in this role, a new set of measures
will have to be developed. In any case, as Cattell has so cogently stated
elsewhere:

". . . the most revolutionary transitions in sciences have usually
occurred through methodological innovation rather than grand and 
bookish theories. A new direction and power is usually given by 
devices — as by the microscope, the telescope, and the electron tube, 
or more subtly by stereochemistry or the differential calculus — by 
the light of which all can see emerging new theories. These methodo- 
logical inventions solve new kinds of problems and do so, moreover, 
with altogether more exact standards of what constitutes a solution.
The more exact theories readily enough follow, because they are made
possible by the new vision" (Cattell, 1966; p. viii). 

If the interaction hypothesis is a fruitful one — i.e., if powerful in- 
teractions between course treatments and some student personality traits 
actually exist in nature — then clearly it is time to try a broad-band search 
to find measures of such traits. Two tactics may prove necessary. First 
should come a systematic empirical sweep through already-existing personality 
measures to mine off the most promising interaction variables. However, if 
the existing lode appears to be empty, then new measures may have to be de- 
veloped with this specific goal in mind. These are precisely the twin aims 
of the present research project. Hopefully, its "methodological innovations,"
if replicated in subsequent empirical explorations, may then serve to guide
new theoretical developments.

While it would be desirable to sample comprehensively both from the
large set of potential personality traits and the smaller (but still considerable)
set of instructional treatments, any one project will be forced
to restrict its scope. The present research program is predicated on the
belief that, at this stage, comprehensive coverage of personality traits is
more crucial than equal coverage of instructional formats. Consequently, a
broad-band set of personality measures was included in the present project,
and college instructional procedures were limited to four, two of which lie
near the poles of an important instructional continuum: the degree of structure
provided the student by the course format. If personality measures can
be found which interact with treatments classified as either relatively
"structured" or "unstructured," then future research can expand the scope of
this investigation to other variations in instructional treatment.

However, even within the set of personality measures some sampling is 
necessary; for example, one could utilize the 80 aptitude factors developed 
within the framework of Guilford's (1967) model of the structure of the 
intellect; or conversely, one could opt to exclude aptitude tests and instead 
focus on other personality measures. While both approaches must be tried, 
the present project utilized non-cognitive measures. And, in order to collect 
a large number of such scores from an even larger number of college students,
it was necessary to eschew all individually-administered instruments (both
projective techniques, a set easily eliminated on other grounds, and "objective
tests of personality" [e.g., Cattell & Warburton, 1967], a less easily defended
choice).

An Overview of the Present Research Project

The general goal of this research program was to discover those personality 
characteristics which differentiate college students who tend to learn more 
effectively from one instructional format than from some other, so that ultimately 
instructional procedures can be more optimally aligned with individual differences 
among students. Two college courses were studied concurrently, and four different 
teaching conditions were utilized in each course. A comprehensive battery of 
structured personality inventories was administered to each of the students, and 
three types of criterion measures were collected in both courses. 

In Chapter II, the methodology of the project is summarized, and the pro- 
cedures used in the two experimental courses are detailed. Chapter III focuses 
solely on main effects: those due to treatment variables (i.e., the relationships
between the instructional conditions and the course outcomes), and those arising 
from the personality variables (i.e., the relationships between scale scores and 
the criterion measures). Chapter IV presents the major trait-by-treatment inter- 
actions based upon the a priori personality scales. Chapter V describes the 
construction of new empirical interaction scales and presents the results using 
this strategy of scale construction. In Chapter VI, the major findings are 
reviewed and discussed. Finally, Chapter VII summarizes the entire Report. 



Chapter II
PROCEDURES

The Subjects 

The project was carried out within the framework of two concurrent 
Psychology courses, so that any significant findings from one course could be 
immediately replicated in a course containing the same general sort of students 
(i.e., predominantly college sophomores) exposed to material of approximately
the same level of difficulty but in another content area. The two courses,
Individual Differences and Developmental Psychology (Course A) and Personality
(Course B), formed the last pair of a three-pair sequence of courses at the
Introductory Psychology level at the University of Oregon in the Spring Quarter 
of 1965. Students were allowed to choose one course of a pair during each of 
three academic Quarters, thereby fulfilling the requirements for the Intro- 
ductory Psychology sequence. Of the 892 students initially electing either of 
these two courses, complete criterion data were available for 806 — the sample 
used in most of the data analyses. 

The Teaching Methods 

Students in each of the two experimental courses were assigned on a non-systematic
basis to one of four types of instructional formats. These experimental
treatments included two forms of instructor "input" (Traditional lectures
vs. Self-study instruction) and two forms of student "output" (Multiple-choice
quizzes vs. Integrative papers), combined to form the four-fold experimental
design displayed in Table 1.

Students were not allowed any choice of teaching method; they did not know 
before classes began that there was more than one method being offered, and 
transfers between sections were permitted only in a few exceptional cases. 

Table 1

The Experimental Design

                                  Instructor Input
Student Output          Lecture (L)        Self-study (S) Instruction
Quiz (Q) Sections       LQ1, LQ2 *         SQ1, SQ2
Paper (P) Sections      LP1, LP2           SP1, SP2 *

Number of Subjects with Complete Criterion Data

                     Course A                  Course B
                   L           S             L           S
Q             [illegible] *    95       [illegible] *    93
P                 86      [illegible] *     115     [illegible] *

* Circled cell in the original table. The circled counts are not legible in this copy.

Within each course, all students in the four LQ and LP sections met together 
in one large lecture hall to receive formal lectures on Mondays and Wednesdays of 
each week. They then met in smaller sections for one hour later in the week. 
Students in the four SQ and SP sections had no formally scheduled class meetings 
on Mondays and Wednesdays, but instead were encouraged to use the additional two 
hours per week for extra reading and studying. A comparison of the performance 
of the students in the Lecture (LQ and LP) with those in the Self-study (SQ and 
SP) sections provides information regarding the differential effects of traditional 
lectures vs. self-study instruction. 

Students in the LQ and SQ sections were administered four multiple-choice
quizzes during the Quarter, spaced approximately two weeks apart, two during the 
first half of the course and two more during the second half. The quizzes, which 
were about 25 minutes in length, covered material included in the assigned sections 
of the textbooks. After the quiz answer sheets had been collected, the instructor 
provided the students with the correct answers. Concurrently, students in the LP 
and SP sections were required to write four integrative essays during the Quarter, 
to be turned in approximately two weeks apart, two papers due during the first 
half of the course and two more during the second half. Students were encouraged 
to examine critically the material included in the assigned sections of the 
various textbooks, as well as any other material they felt was relevant to the 
topic being considered. 

The quizzes and the papers were graded and returned to the students. The 
final course grade was determined on the basis of the scores from the quizzes or 
papers on the one hand, and the scores on two content examinations on the other.
Quiz and paper scores both contributed the same amount (40%) to the final course
grade. Consequently, any differences in performance between students in the quiz 
sections and those in the paper sections should relate to the differential effec- 
tiveness of these two instructional procedures, rather than to any differential 
perceptions of their weight in determining the course grade. 

All students in both courses were required to attend the weekly section 
meetings, where some of the personality measures were administered and the others 
(taken at home) were collected. Each of these sections was taught by one of four 
Teaching Assistants, who were advanced graduate students in the Psychology 
Department at the University of Oregon. Two Teaching Assistants were assigned to 
each course, each teaching one section using each of the four treatment conditions 
(e.g., one Teaching Assistant taught one LQ, one LP, one SQ, and one SP section 
in Course A). Consequently, any effects due to the differing personalities of the 
Teaching Assistants were uniformly distributed across the experimental treatments, 
and therefore such effects were not confounded with those of the teaching methods 
themselves. 

While the experimental design for this project allowed a comparison between 
lecture vs. self-study methods and between quizzes vs. papers, it also permitted 
an examination of the joint effects of these two aspects of college teaching as 
scaled on a potentially more general dimension of college instruction: the degree 
of structure provided by the instructional format. Ordered on this dimension, the 
LQ sections clearly provided the most structure, while the SP sections were 
probably as unstructured as are likely to occur at the undergraduate level. 
Therefore, the differential effects of teaching methods located near the two poles 
of the structured vs. unstructured dimension (the circled cells in Table 1) could 
be assessed. 

The Personality Measures 

While the comparative effects of the different teaching methods are of some 
interest, the major innovation of the present study over previous ones lies in 
the administration of a comprehensive battery of personality inventories, in 
order to discover any interactions between student personality characteristics 
and the instructional treatments. These personality measures, which are listed 
in Tables 2 and 3, were chosen (a) to include those scales which on theoretical, 
or previous empirical, grounds showed any relevance as potential interaction 
variables (e.g., Siegel & Siegel's [1965] Educational Set Scale), and (b) to span 
as broadly as possible the range of personality traits presently measured by 
paper-and-pencil questionnaires and inventories. Some of the personality 
inventories were administered during the section meetings, while others were 
distributed to students to be completed at home and returned the following week. 

Insert Tables 2 and 3 about here 

Partly as an inducement to obtain their cooperation in the completion of 
the personality inventories, students were told that they could receive their test 
scores at a later date. About two-thirds of the students initially requested 
their scores, and one-quarter of the students actually came back six months later 
to obtain them. Although course grades were not contingent upon completion of 
the inventories, this task was presented as an integral component of the course 
work, and attempts were made toward the end of the course to obtain any missing 
protocols. 

It is difficult to estimate the effect of the "captive" nature of this 
sample on the reliability of the research data obtained. At the time the course 
was being conducted, it seemed apparent that some students were not attending 
carefully to the research tasks, and therefore attempts were made to identify 
those students who may have been less than candid when taking each inventory. 

Table 2 

The Student Characteristics Assessed in this Project 

                                                                        Administration
                                                      No. of  No. of    ---------------
                                                      Items   Scales    Week   Home vs.
                                                              Scored(a) No.    Class

Administered to the Total Sample

Published Inventories
  California Psychological Inventory (CPI)            480 b    49        2     Home
  Survey of Study Habits & Attitudes (SSHA)            75       8        3     Class
  Adjective Check List (ACL)*                         300      26        3     Home
  Welsh Figure Preference Test (WFPT)                 400 b    23        6     Class
  Edwards Personal Preference Schedule (EPPS)         225 b    15        6     Home
  Minnesota Multiphasic Personality Inventory (MMPI)  566 b    75        7     Home
  Strong Vocational Interest Blank (SVIB)             405      97        8     Class

Non-Published Inventories and Scales
  Oregon Instructional Preference Inventory (OIPI)*    84       -        1     Class
  Biographical Inventory (BI) c                        26       -        3     Class
  Bass' Social Acquiescence Scale (SAS)                56       7        3     Class
  Reported Behavior Inventory (RBI)                   250      16        4     Home
  Composite Personal Reaction Inventory (CPRI) c      151       7        8     Home
  Siegel & Siegel's Educational Set Scale (ESS)*       93       7        8     Home
  Composite Choice Preference Inventory (CCPI)        156      12        9     Home

Other Measures                                          -      23        -     -
  Sex
  Class in college
  College grade point average (GPA)
  Scholastic Aptitude Test: Verbal (SAT-V) and Mathematical (SAT-M) Scores
  Predicted Peer Ratings (18 CPI Scales)

Each Administered to (Different) Half-Sample
  16 Personality Factor Questionnaire (16PF)          187      23        5     Home
  Motivation Analysis Test (MAT)                      208      45        5     Home

* Inventories for which the new empirical interaction scales were constructed. 

a Does not include the empirical interaction scales, nor the "deviancy vs. commonality" 
and "response bias" scales constructed for each of the inventories. 

b Includes 12 (CPI), 20 (WFPT), 15 (EPPS), and 16 (MMPI) duplicated items. 

c See Table 3. 

Table 3 

The Variables Included in the Composite Personal Reaction Inventory, 
the Composite Choice Preference Inventory, 
and the Biographical Inventory 

                                                          No. of   No. of Scales
                                                          Items    Scored

Composite Personal Reaction Inventory (CPRI)
  Barron: Originality Scale                                 22         1
  Marlowe-Crowne: Social Desirability Scale                 33         1
  Walk: Intolerance of Ambiguity Scale                       8         1
  Sarason: Test Anxiety Scale                               16         1
  Sarason: Need for Achievement Scale                       30         1
  Sarason: Lack of Protection Scale                         27         1
  Vogel-Raymond-Lazarus: Achievement Values Scale           15         1

Composite Choice Preference Inventory (CCPI)
  Liverant-Scodel: Locus of Control Scale                   23         1
  Allport-Vernon-Lindzey: Study of Values (Part I)          30         6
  Zuckerman: Sensation-Seeking Scales                       34         3
  Forced-Choice Dogmatism Scale                             40         1
  Forced-Choice F-Scale                                     29         1

Biographical Inventory (BI)
  Number and type of previous Psychology courses             3
  Satisfaction with previous Psychology courses              2
  Plans for future Psychology courses                        1
  College major and graduate school plans                    2
  Occupational choice                                        2
  Present and past places of residence                       2
  Father's occupation and education                          2
  Mother's education                                         1
  Birth order and number of siblings                         3
  Parents' present marital status                            1
  Student's marital status                                   1
  Employment status and college financing                    2
  Expected course grade and expected G.P.A.                  3
  Number of friends in the course                            1

One or more of the following methods were available to detect potentially invalid 
protocols: (a) visual inspection of the answer sheets to eliminate obviously 
invalid protocols (e.g., many items left blank, all answers marked "True," etc.), 
(b) construction of "response deviancy" scales for each of the inventories, by 
identifying a set of items with extreme response imbalance in the present sample 
and then scoring each subject's response protocol on each of the new scales in 
order to identify grossly deviant protocols, (c) analysis of responses to the 
repeated items in the CPI, MMPI, EPPS, and WFPT (and the 167 identical items 
common to the CPI and MMPI) to eliminate subjects responding inconsistently, 
(d) use of previously constructed "response bias" and "faking" scales on the 
CPI (e.g., Cm, Wb, Gi) and the MMPI (e.g., L, F, K, F-K, Sd, Mp), (e) comparison 
of "subtle" vs. "obvious" measures of the same trait, where both were available 
(e.g., the MMPI), (f) the analysis of canonical correlations among all sets of 
inventory scales (e.g., the 18 CPI vs. the 15 EPPS scales) to develop test-to-test 
predictability equations on which each protocol could be scored and deviant 
protocols eliminated, and (g) inspection of the four questions on the Course 
Evaluation Questionnaire (see Appendix B) which dealt with student reactions to 
the personality inventories, in order to separate students who claimed to enjoy 
taking the inventories from those who did not. 
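
As a rough illustration of method (c), the sketch below counts how often a student answers a repeated item differently on its two presentations and flags protocols above a cutoff. The item numbers, the True/False response coding, the cutoff, and the function names are all assumptions made for the example; the report does not state the rule actually applied.

```python
from typing import Dict, List, Tuple


def inconsistency_count(responses: Dict[int, bool],
                        repeated_pairs: List[Tuple[int, int]]) -> int:
    """Count repeated-item pairs answered differently on the two presentations."""
    count = 0
    for first, second in repeated_pairs:
        # Pairs with a blank on either presentation are skipped, not counted.
        if first in responses and second in responses:
            if responses[first] != responses[second]:
                count += 1
    return count


def flag_inconsistent(protocols: Dict[str, Dict[int, bool]],
                      repeated_pairs: List[Tuple[int, int]],
                      max_inconsistent: int) -> List[str]:
    """Return the ids of protocols with more inconsistencies than the cutoff."""
    return [student for student, responses in protocols.items()
            if inconsistency_count(responses, repeated_pairs) > max_inconsistent]


# Hypothetical data: three repeated pairs, two students.
pairs = [(1, 5), (2, 6), (3, 7)]
protocols = {
    "s01": {1: True, 2: False, 3: True, 5: True, 6: False, 7: True},
    "s02": {1: True, 2: False, 3: True, 5: False, 6: True, 7: True},
}
flagged = flag_inconsistent(protocols, pairs, max_inconsistent=1)
```

With these made-up responses only the second student exceeds the cutoff. In practice the cutoff would have to be chosen against the distribution of inconsistency counts in the sample.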

Methods (a) and (b) were used for all of the inventories, and methods 
(c) through (g) were employed with some of them. These analyses suggested 
that the proportion of subjects in the project who provided unreliable 
inventory data was not appreciably greater than might be expected in any 
sample of subjects administered a long battery of psychological tests. While 
further work on this question is still underway, it is important to realize 
that any random errors introduced into the personality data through invalid 
protocols will serve to attenuate all relationships between inventory scores 
and other measures and thus to hide interactions which, under better conditions 
of test administration, might have appeared. Therefore, to the extent to which 
the reader judges this problem to be a significant one, he must place all 
the more credence in those relationships uncovered in this project, 
relationships which appeared through the fog of these less than ideal 
test-taking conditions. For a further discussion of this potential source of 
error, see Chapter VI. 
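
The attenuation argument can be made concrete with the classical correction-for-attenuation relation, under which an observed correlation equals the underlying correlation multiplied by the square root of the product of the two reliabilities. The sketch below uses that textbook relation; the numerical values are hypothetical and are not estimates from this project.

```python
import math


def attenuated_correlation(true_r: float, rel_x: float, rel_y: float) -> float:
    """Expected observed correlation given the reliabilities of both measures.

    Classical test theory: r_observed = r_true * sqrt(rel_x * rel_y).
    """
    return true_r * math.sqrt(rel_x * rel_y)


# Hypothetical values: an underlying correlation of .40 between a trait and a
# criterion, measured with reliabilities of .70 and .80.
observed = attenuated_correlation(0.40, 0.70, 0.80)
```

Under these assumed values the observed correlation falls to roughly .30, which is the sense in which invalid protocols can only hide, and not manufacture, trait-criterion relationships.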

Criterion Measures: The Initial Set 

Three general types of criteria were multiply assessed in both of the 
experimental courses: (a) knowledge of course content, (b) the amount of 
extracurricular (non-graded) reading the students carried out, and (c) 
satisfaction with the instructional treatments. Each of these three classes of 
criteria will be discussed in turn. 

Course Achievement. Two content examinations were administered in each 
course, one approximately half-way through the term, and the other at the 
end of the course. Each examination included 10 questions previously included 
in the quizzes and from 60 to 80 new questions. While only the latter were 
used as measures of course achievement, the inclusion of the former allowed 
some estimate of the effects of sheer practice on examination performance. 
The second examination in each course included, in addition to 60 
multiple-choice questions, an integrative essay covering the content of the 
course. Thus, both divergent thinking (as measured by an essay) and convergent 
thinking (as measured by a multiple-choice examination) were available as 
measures of the course achievement criterion. 

Amount of extracurricular reading. The only unique criterion to be employed 
in this project was one assessing the extent to which students read relevant 
material which, while available to everyone, was explicitly understood as not 
involved in the determination of the course grade. All students in both courses 
were asked to buy a preselected set of 20 reprints from the Scientific American. 
These reprints, the same set for students in both courses, were sold along with 
the textbooks by the University bookstore as material required for each course. 
At the first class meeting, all students were given a course reading list; weekly 
reading assignments from four paperback textbooks were listed as "Required 
Reading" and the Scientific American reprints most relevant to each topic were 
listed as "Supplementary (Optional) Reading." On the reading lists and on a 
course syllabus distributed at the same time, the following statement appeared: 
"Reading material assigned as 'Supplementary Reading' will not be used for grading 
purposes." In addition, the course instructors emphasized in the first classes 
that while the reprints were relevant to the course and should prove helpful in 
understanding the textbook material, they would not be used for grading purposes. 

The use of these twenty Scientific American reprints thus provided an 
opportunity for assessing the extent to which the different teaching methods 
encouraged extracurricular reading. Questions about each reprint were written 
to assess whether the student had read this material; these questions were 
constructed so as to be quite easy for anyone who had read the reprint, while 
simultaneously being extremely difficult for anyone who had not read it. All 
questions were pre-tested on samples of students from another college, half 
of whom had read, and half had not read, the reprints; from a larger pool of 
items, 20 were selected which maximally differentiated the two groups. 
Consequently, scores on this test provided relatively precise information on the 
extent to which each student had read this extra material. This test was 
administered after the final examination in the course, with instructions to the 
students that these scores were only to be used for research purposes. In 
addition, one of the questions on the Course Evaluation Questionnaire, 
administered at the end of the term, asked directly for the number of reprints 
read. 
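
One simple way to pick items that "maximally differentiate" readers from non-readers is to rank items by the difference in proportion correct between the two pre-test groups. The sketch below assumes that index and 0/1 item scoring; the report does not say which discrimination statistic was used, and the data and function names are hypothetical.

```python
from typing import List


def discrimination_indices(readers: List[List[int]],
                           non_readers: List[List[int]]) -> List[float]:
    """Proportion correct among readers minus proportion correct among non-readers."""
    n_items = len(readers[0])
    indices = []
    for j in range(n_items):
        p_read = sum(student[j] for student in readers) / len(readers)
        p_not = sum(student[j] for student in non_readers) / len(non_readers)
        indices.append(p_read - p_not)
    return indices


def select_items(readers: List[List[int]],
                 non_readers: List[List[int]], k: int) -> List[int]:
    """Return the (sorted) positions of the k items with the largest index."""
    d = discrimination_indices(readers, non_readers)
    ranked = sorted(range(len(d)), key=lambda j: d[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical pre-test: 4 candidate items, 3 readers and 3 non-readers,
# each row holding one student's item scores (1 = correct, 0 = incorrect).
readers = [[1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 0, 1]]
non_readers = [[0, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0]]
chosen = select_items(readers, non_readers, k=2)
```

Here the first and last items separate the groups while the middle two do not, so the former are retained. The same ranking applied to a larger pool with k = 20 would mirror the selection step described above.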

Satisfaction with the courses. At the very end of the term, a 42-item 
Course Evaluation Questionnaire was administered in both courses. While 
students were asked to sign these evaluation forms, care was taken to assure 
each student that his candid opinions could not affect his course grade. The 
Evaluation Questionnaire included rating scales tapping attitudes toward 
different aspects of the course, many of which had been developed in previous 
studies of college instruction (e.g., Goldberg, 1964, 1965). The Course 
Evaluation Questionnaire is included in this Report as Appendix B. 

Finally, a short measure of group morale (in effect, a morale thermometer) 
was administered in all sections of each course every two weeks throughout the 
term. Students were asked to rate their satisfaction with the course; these 
ratings were made anonymously to relieve any possible fear that the evaluations 
might influence course grades. Since measures of group morale were gathered on 
six occasions throughout the term, it was possible to plot a morale curve 
for each section over time and thus to compare teaching methods in terms of 
the relative pattern of these morale curves. However, since this instrument 
was administered anonymously, it was not possible to relate student personality 
measures to individual morale curves. Since the findings stemming from the 
"morale thermometer" are not central to the interaction hypothesis which guided 
the research project, they are not included in the present report. 

Criterion Measures: The Final Set 

Of the 42 questions in the Course Evaluation Questionnaire (see Appendix 
B), 15 dealt with aspects of the courses which were unique to one or two cells 
of the experimental design (e.g., the value of the lectures), 8 concerned 
reactions to the textbooks, and 15 dealt with general, but not criterial, issues. 
The remaining 4 questions, listed in Table 4, were intercorrelated, along with 
four achievement test scores: scores from (a) the first (multiple-choice) 
examination, (b) the multiple-choice portion of the second examination, (c) the 
essay portion of the second examination, and (d) the special questions from the 
second examination covering the contents of the (non-graded) Scientific American 
reprints. The correlations among these 8 outcome variables, separately 
computed in each of the 2 experimental courses, are presented in Table 5. These 
two correlation matrices were factor analyzed, using both a principal factors 
(R² in the diagonal) and a principal components (unity in the diagonal) solution, 
each of which was rotated by one oblique and two orthogonal procedures. The 
data turned out to be so cleanly structured that all solutions gave quite 
similar results. The rotated factor structures from each course, using the 
principal components solution with a Varimax rotation, are presented in Table 6. 

Insert Table 4 about here 

Insert Table 5 about here 
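
For readers unfamiliar with the rotation step, the following sketch applies a raw (unnormalized) Varimax rotation to a small loading matrix by successive planar rotations, using the closed-form angle for each pair of factors. It is an illustration only: the unrotated loadings are hypothetical, the function names are invented for the example, and the report's own analysis used a normalized Varimax on the principal components of the eight outcome variables.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p, k = len(loadings), len(loadings[0])
    total = 0.0
    for j in range(k):
        squared = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squared) / p
        total += sum((s - mean_sq) ** 2 for s in squared) / p
    return total


def varimax(loadings, max_sweeps=50, tol=1e-8):
    """Rotate a loading matrix toward simple structure, one factor pair at a time."""
    a = [list(row) for row in loadings]
    p, k = len(a), len(a[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                u = [row[i] ** 2 - row[j] ** 2 for row in a]
                v = [2.0 * row[i] * row[j] for row in a]
                A, B = sum(u), sum(v)
                C = sum(x * x - y * y for x, y in zip(u, v))
                D = 2.0 * sum(x * y for x, y in zip(u, v))
                # Angle that maximizes the criterion within this factor plane.
                phi = 0.25 * math.atan2(D - 2.0 * A * B / p,
                                        C - (A * A - B * B) / p)
                largest = max(largest, abs(phi))
                c, s = math.cos(phi), math.sin(phi)
                for row in a:
                    x, y = row[i], row[j]
                    row[i], row[j] = x * c + y * s, -x * s + y * c
        if largest < tol:  # no pair needed further rotation
            break
    return a


# Hypothetical unrotated loadings for four variables on two components.
unrotated = [[0.6, 0.5], [0.7, 0.4], [0.5, -0.6], [0.6, -0.5]]
rotated = varimax(unrotated)
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across factors moves toward the "clean" pattern the text describes.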



Table 4 

Four Criterion Variables from the 
Course Evaluation Questionnaire 

Variable  Question                                           Response Options

   4      How satisfied are you at the present time with     1-9 (Extremely satisfied
          this course?                                       to Extremely dissatisfied)

   5      What is your reaction to the manner in which       1-7 (Very disappointed
          this course was taught?                            to Very delighted)

   6      How does the probable long-range value for you     1-5 (Lowest 10% to
          of this course compare with all other courses      Highest 10%)
          you have had in college?

   8      How many Scientific American reprints, of those    0-9 (None to 17 or more)
          assigned as supplementary reading, have you
          read up to this time?

Table 5 

Intercorrelations among the Eight Outcome 
Variables in Each of the Two Courses 

                                                                           Course A
Variable                    1     2     3     4     5     6     7     8    Mean    σ

First exam             1    --   .60   .30  -.11   .07   .19   .06   .18   60.9   7.1
M-C score              2   .56    --   .28  -.05   .01   .07   .11   .21   41.6   5.8
Essay score            3   .35   .31    --  -.04  -.01   .07   .01  -.02   50.2   9.8
Satisfaction           4  -.20  -.07   .01    --  -.74  -.53  -.05  -.06    5.7   2.2
Reaction               5   .16   .01   .05  -.71    --   .57   .03   .00    3.4   1.7
Long-range Value       6   .14   .10   .05  -.47   .49    --   .06   .04    2.8   1.2
Reading: Test score    7   .15   .15   .13   .00   .01   .07    --   .44    2.3   2.5
Reading: No. read      8   .27   .30   .08  -.06   .03   .01   .39    --    7.9   3.0

Course B   Mean           51.1  40.6  50.1   5.9   2.9   2.9   2.3   8.0
           σ               6.3   5.5  10.1   2.2   1.6   1.1   2.5   3.3

Note: Correlations from Course A (N = 381) are listed above the main diagonal; 
those from Course B (N = 425) are listed below the diagonal. 

Note the virtual identity of the factor structures in the two courses. 

Insert Table 6 about here 

Using the analyses presented in Table 6, factor scores were computed 
for each student in each course, and these three factor scores (Achievement, 
Satisfaction, and Amount of Non-graded Reading), plus the essay and 
multiple-choice sub-scores from the second examination, were utilized as the five 
major outcome variables in all of the subsequent analyses. These five criteria, 
then, include three measures of course achievement (multiple-choice examination 
score, essay examination score, and over-all achievement factor score), one 
global measure of course satisfaction, and one measure of non-graded reading. 




Table 6 

The Factor Structure of the Eight Outcome 
Variables in Each of the Two Courses 

                             I*           II*          III*           h²
Variable                   A     B      A     B      A     B       A     B

First exam                .00   .76                               .66   .63
M-C score*                .73   .69                               .55   .52
Essay score*              .39   .45                               .15   .20
Satisfaction                           -.83  -.83                 .69   .69
Reaction                                .90   .86                 .80   .75
Long-range Value                        .64   .56                 .42   .32
Reading: Test score                                  .90   .90    .82   .84
Reading: No. read                                    .48   .41    .23   .18

Note: All loadings > .20 are tabled. Course A: N = 381; Course B: N = 425. 
Results are from a normalized Varimax rotation of the principal 
components analyses (unities in diagonal). 

* Variables used for subsequent analyses (3 factor scores + 2 test scores). 

Statistical Analyses 

Since the primary focus of this investigation was upon the demonstration 
of trait-by-treatment interaction effects, some comments are now in order 
concerning the procedures used to recognize, and to test the statistical 
significance of, such interactions. There are at least two classes of statistical 
test used for demonstrating a significant interaction effect. The first, and 
most common, is by means of a statistically significant F-ratio for a particular 
interaction line in an analysis of variance (ANOVA) or covariance (ANOCA). 
The second is by means of a statistically significant difference between 
two or more correlation coefficients (r) or between two sets of regression 
weights obtained from linear regression analyses (R). Both classes 
of procedures are based upon an identical set of assumptions, namely those of 
the general linear model (e.g., Cohen, 1968), and both were utilized in 
the present project. 

In using the ANOVA or ANOCA procedures to establish significant interaction 
effects, one begins with a set of nominal (categorical) measures for each of 
two or more independent (and orthogonal) variables; the dependent variable 
is the outcome or criterion score of interest to the investigator. For 
example, using the present experimental design, we can examine the effects of 
Lecture (L) vs. Self-study (S) instruction, and Quiz (Q) vs. Paper (P) sections, 
upon the outcome variable of course achievement. Using the traditional ANOVA 
or ANOCA procedures, we can test for the significance of: (a) the L vs. S 
main effect, (b) the Q vs. P main effect, and finally (c) the L-S x Q-P 
"interaction effect" (a series of significance tests which are discussed in 
Chapter III). However, it is important to bear in mind that this particular 
"interaction" is a treatment-by-treatment one, not a trait-by-treatment 
interaction of the sort for which we are searching. To test for the latter, we 
could dichotomize, trichotomize, or generally multichotomize the scores on one 
or more personality scales of interest (e.g., Anxiety) and then test for the 
significance of: (a) the L vs. S main effect, (b) the Q vs. P main effect, 
(c) the High vs. Medium vs. Low Anxiety (A) main effect, (d) the L-S x Q-P 
(treatment) "interaction," (e) the L-S x A interaction, (f) the Q-P x A 
interaction, and finally (g) the L-S x Q-P x A interaction, the last three being 
examples of the sort of interactions we hope to discover. We could then estimate 
the proportion of the variance in the dependent (criterion) variable 
"attributable" to each of the seven effects by means of some statistic like 
ω² (Hays, 1963). 
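
As a sketch of the kind of variance-accounted-for statistic mentioned here, the function below computes the usual fixed-effects estimate of omega squared from quantities found in an ANOVA summary table. The sums of squares are hypothetical, and the function name is an illustration rather than anything taken from the report.

```python
def omega_squared(ss_effect: float, df_effect: int,
                  ss_total: float, ms_error: float) -> float:
    """Estimated proportion of criterion variance attributable to one effect.

    Fixed-effects estimate: (SS_effect - df_effect * MS_error) / (SS_total + MS_error).
    """
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)


# Hypothetical ANOVA summary values for a single-degree-of-freedom effect.
estimate = omega_squared(ss_effect=30.0, df_effect=1, ss_total=500.0, ms_error=1.2)
```

With these assumed values the effect accounts for a little under 6% of the criterion variance. Applying the same function to each line of the summary table would give the seven estimates the text refers to.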

This procedure, while useful for variables which are naturally dichotomous 
(e.g., sex) or otherwise categorical (e.g., place of residence), is a 
cumbersome one for the mass of personality inventory scale scores of the sort 
used in this study. For this and other reasons, most of the findings relating to 
the interaction hypothesis (Chapters IV and V) will be presented in terms of 
correlational differences. The correlations between each scale score and each 
of the five outcome variables were computed for the students in each course 
separately within each of the four cells of the experimental design. These 
correlations were computed separately for male and for female students. In 
addition, similar correlations were computed for male and for female students in 
each of the four experimental treatments: 

(L) Lecture (LQ and LP sections combined). 

(S) Self-study (SQ and SP sections combined). 

(Q) Quiz (LQ and SQ sections combined). 

(P) Paper (LP and SP sections combined). 

Since this is an exploratory investigation in which the relative significance 
of the L vs. S and the Q vs. P experimental treatments is unknown, it was 
decided a priori to analyze the correlational differences between students exposed 
to the most structured (LQ) and the least structured (SP) sections, and the L vs. 
S and the Q vs. P teaching conditions. A significant difference 
in the correlations between students in any pair of these conditions across the 
two courses can then be interpreted analogously to a significant interaction in an 
ANOVA analysis which includes one treatment variable having two levels and one 
personality variable having many (ordered) levels. The procedures for testing 
the significance of correlational differences on a post hoc basis are detailed in 
Marascuilo (1966). In the present study, the procedure involved the calculation 
of Z in the following equation: 

    Z = [equation not legible in this copy] 

where: 

    z(1-4) = the Z-converted correlation coefficients, each involving a 
             test score and a criterion variable, 

    n(1-4) = the number of subjects in each condition, 

and where conditions 1 and 3 (e.g., LQ) and conditions 2 and 4 (e.g., SP) are 
equivalent conditions in Course A (1 and 2) and Course B (3 and 4), respectively. 
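
Since the equation itself does not survive in this copy, the sketch below shows one standard way such a statistic can be formed from the quantities defined above: each correlation is converted with Fisher's z transformation, the within-course differences are summed, and the sum is divided by its standard error. This particular form is an assumption and may differ from the equation actually used in the project; the correlations and sample sizes are hypothetical.

```python
import math
from typing import Sequence


def fisher_z(r: float) -> float:
    """Fisher's r-to-z transformation."""
    return math.atanh(r)


def pooled_difference_z(r: Sequence[float], n: Sequence[int]) -> float:
    """Z for the difference between conditions, pooled over two courses.

    r and n hold four correlations and sample sizes in the order
    (condition 1, condition 2, condition 3, condition 4), where 1 and 2 come
    from Course A and 3 and 4 are the equivalent conditions in Course B.
    ASSUMED form: [(z1 - z2) + (z3 - z4)] / sqrt(sum of 1 / (n_i - 3)).
    """
    z1, z2, z3, z4 = (fisher_z(value) for value in r)
    numerator = (z1 - z2) + (z3 - z4)
    variance = sum(1.0 / (size - 3) for size in n)
    return numerator / math.sqrt(variance)


# Hypothetical example: r = -.30 in LQ and .30 in SP in both courses,
# with 100 students in each of the four groups.
z_stat = pooled_difference_z([-0.30, 0.30, -0.30, 0.30], [100, 100, 100, 100])
```

Under these made-up values the statistic is about -6.1, far beyond conventional critical values. When the paired correlations are equal it is exactly zero, which matches the interpretation of the test as a check on correlational differences.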

The following two hypothetical interaction effects illustrate this general 
methodology: 

[Figure: two hypothetical sets of correlations with Criterion A, one for Trait X 
and one for Trait Y, arranged by L vs. S and Q vs. P cells within Course A and 
Course B; the cell values are not legible in this copy.] 

The first hypothetical interaction, involving Trait X and Criterion A, illustrates 
the ideal case: a significant negative (or positive) correlation in the LQ cell 
and one of a similar size but of opposite sign in the SP cell. Such a pattern of 
correlational differences, which is probably quite rare in psychology, cannot be 
represented by a linear model (i.e., only main effects) since the population 
correlation is approximately zero. The second hypothetical example (for Trait 
Y), which is probably more likely to occur, represents cases where a personality 
measure is significantly related to a criterion among students in one treatment 
condition and is not so highly related among students in the other. These sorts 
of interactions are reasonably well predicted by linear models, since the 
regression lines do not cross, as they do in the first example. 

Any significant interactions uncovered in the present study can stem 
primarily from the Lecture (L) vs. Self-study (S) treatment, from the Quiz (Q) 
vs. Paper (P) treatment, or from their joint effects. 

[Figure: hypothetical correlations in the L vs. S by Q vs. P cells of Course A 
and Course B illustrating each of these three cases, with the LQ and SP cells 
circled; the illustrative values shown include -.30, .00, and .30.] 

These analyses should suggest whether the presence vs. absence of lectures is more 
important than the use of quizzes rather than papers in producing significant 
interactions with student personality characteristics, thus serving to guide 
future replications and extensions of the present findings. 







Chapter III 

ANALYSES OF THE MAIN EFFECTS 



Two major classes of main effects can be considered, namely those 
stemming from the experimental treatment interventions (the teaching methods) 
and those stemming from the personality characteristics (the attributes or 
traits) of the students themselves. The effects of these two classes of 
variables upon the five criteria will each be presented in turn. 

The Experimental Teaching Methods 

The effects of the experimental variations in teaching method were 
examined by means of analyses of variance for each of the five outcome 
variables. Table 7 summarizes the results of 10 of these analyses (one for 
each of the five criteria, separately in each of the two courses). The values 
in parentheses in Table 7 are the point-biserial correlations between the 
students' instructional format (e.g., students in lecture sections were 
coded "0" and those in self-study sections were coded "1") and their scores on 
the criterion variable. Thus these values, providing an index of the strength 
of the effects whose significance level is given by the analysis of variance, 
permit the reader to compare directly the effects due to situations 
(experimental treatments) with those due to personality traits (student 
attributes), using the same index of degree of association (the product-moment 
correlation). 

Table 7 

The Effects of the Experimental Teaching Conditions 
upon the Five Major Outcome Variables: 
Analyses of Variance and Correlations 

                                             Teaching Methods
Outcome Variable      Course  L vs. S                     Q vs. P                     Interaction

Course Achievement:     A     n.s. (-.04)                 Q>P, F=15.2, p<.01 (-.20)   n.s.
  Factor Score          B     L>S, F=6.1, p<.05 (-.12)    n.s. (-.06)                 n.s.

Course Satisfaction:    A     n.s. (-.05)                 n.s. (.06)                  n.s.
  Factor Score          B     n.s. (.07)                  n.s. (-.06)                 n.s.

Non-graded Reading:     A     n.s. (-.01)                 P>Q, F=4.7, p<.05 (.11)     n.s.
  Factor Score          B     n.s. (-.03)                 Q>P, F=7.8, p<.01 (-.13)    n.s.

Multiple-choice         A     n.s. (-.08)                 Q>P, F=10.8, p<.01 (-.17)   n.s.
  Test Score            B     L>S, F=10.9, p<.01 (-.16)   n.s. (-.06)                 n.s.

Essay                   A     n.s. (.00)                  n.s. (.03)                  n.s.
  Test Score            B     L>S, F=7.3, p<.01 (-.13)    n.s. (.04)                  n.s.

Note: Course A: N = 381; Course B: N = 425. 

L = Lecture Instruction        Q = Quiz Sections 
S = Self-study Instruction     P = Paper Sections 

Values in parentheses are the point-biserial correlations between teaching 
conditions and scores on the outcome variable. 
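
A short sketch may help show why a point-biserial correlation and an F-ratio can be read as the same information on different scales. For a simple two-group comparison, coding the groups 0 and 1 and computing an ordinary product-moment correlation gives a value whose square equals F / (F + N - 2). The data below are hypothetical, and the conversion is only approximate for a two-way design such as the one used here, where the error degrees of freedom differ slightly.

```python
import math
from typing import Sequence


def point_biserial(groups: Sequence[int], scores: Sequence[float]) -> float:
    """Product-moment correlation between a 0/1 group code and a score."""
    n = len(scores)
    mean_g = sum(groups) / n
    mean_s = sum(scores) / n
    cov = sum((g - mean_g) * (s - mean_s) for g, s in zip(groups, scores))
    var_g = sum((g - mean_g) ** 2 for g in groups)
    var_s = sum((s - mean_s) ** 2 for s in scores)
    return cov / math.sqrt(var_g * var_s)


def r_from_f(f_ratio: float, n: int) -> float:
    """Absolute point-biserial r implied by a two-group F with n subjects."""
    return math.sqrt(f_ratio / (f_ratio + n - 2))


# Hypothetical scores: lecture sections coded 0, self-study sections coded 1.
groups = [0, 0, 0, 1, 1, 1]
scores = [5.0, 6.0, 7.0, 3.0, 4.0, 5.0]
r_pb = point_biserial(groups, scores)   # negative: the group coded 0 scores higher
r_check = r_from_f(6.0, len(scores))    # the one-way F for these data is 6.0
```

The negative sign simply reflects the coding: a higher mean in the group coded 0 yields a negative correlation. Applied as an approximation to an F of 6.1 with 425 students, `r_from_f` gives an absolute value of about .12.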

As Table 7 indicates, the experimental treatment variations did not 
produce any statistically significant main effects common to both of the two 
courses, a finding concordant with three decades of previous instructional 
research. All treatment effects were either non-significant in both courses 
(9 out of 15 analyses), significant in one but not the other course (five 
analyses), or significant in both courses but opposite in direction of effect 
(one analysis). Consequently, these results generally confirm the findings 
from past studies, namely that differences in instructional conditions do not 
show either sizeable or replicable main effects. 

While there were no differences in over-all course satisfaction on the 
part of students assigned to differing instructional procedures, there were 
some interesting differences between experimental treatments in the 
students' implied choices for future courses. One question on the Course 
Evaluation Questionnaire, administered at the last session of each course, 
asked each student to indicate which type of section he "would now prefer" 
if he were enrolling in the course "at the present time." Table 8 presents 
the proportions of students in each of the four instructional formats who 
would elect each of the four types of instruction. Note that there was no 
consistent final preference for either Lecture or Self-study instruction (52% 
vs. 48%), while there was such a general preference for Quiz sections (66%) 
over Paper sections (34%). However, this latter choice appeared to have been 
moderated dramatically by the students' actual course experiences: of those 
who were assigned to Quiz sections, only 18% elected a Paper section; on the 
other hand, of those who were assigned to Paper sections, half preferred the 
same type of section again. 

Table 8 

The Relationship between Students' Experience in a Particular Treatment 
and Their Later Instructional Preferences (Both Courses Combined) 

[Table body not legible in this copy: it cross-classifies students' final 
preference by the type of section in which they were enrolled.] 

Note: Preferences are from the Course Evaluation Questionnaire, administered 
during the last section meeting. Cell entries are proportions of 
those students enrolled in each of the four instructional formats. 
Circled entries represent students electing the treatments to which 
they had been assigned. 

Fortunately, these same students had the opportunity to choose between
frequent quizzes and frequent papers at their first section meeting, before
they had actually taken any quizzes (or written any papers) in these particular
courses; at that time, they responded to a question from the Oregon Instructional
Preference Inventory which asked for their choice between "a course with frequent
quizzes" and "a course requiring frequent papers." Approximately 20% of the
students in the quiz sections and approximately 30% of those in the paper
sections claimed an initial preference for writing papers. Consequently, one
might hypothesize that about 20% of this student sample would initially prefer
writing papers to taking quizzes; while being enrolled in a course requiring
papers may raise this proportion a bit, the experience of actually writing
papers raises the proportion quite substantially (50%). Since this finding
suggests that experiencing an initially unpopular instructional treatment can
change students' attitudes towards it, one might consider this fact before
assigning students to treatments solely on the basis of their initial preferences.

Finally, one other treatment effect deserves a brief mention. Students in
the Quiz sections achieved higher scores than those in the Paper sections on
each of the sets of 10 repeated quiz questions which had been embedded in the
two content examinations (p < .01 on both examinations in both courses);
differences between students in the Lecture and the Self-study conditions were
not significant on either examination in either course. While this finding is
hardly an electrifying one, it does attest to the fact that students can learn
the answers to specific questions from previous quizzes (when adequate feedback
is provided), though this learning does not necessarily generalize to other
questions covering much the same content.

The Student Personality Characteristics 

Table 9 presents the correlations between each of the five criterion variables
and six student attributes typically considered to be related to course outcomes
(GPA, SAT-V, SAT-M, course motivation, class in college, and sex). Note that
the findings were virtually identical in both courses.

Insert Table 9 about here

Sex, class in college, and initial course motivation (whether the course was
required or elective) had essentially no correlation with any of the five
outcome variables. On the other hand, previous college GPA and the two measures
of scholastic aptitude were related to all of the course achievement variables,
and a number of these relationships were of quite substantial size (e.g.,
previous GPA correlated .56 and .52 with the Course Achievement factor score in
the two courses, respectively).

In general, GPA correlated more highly with the course achievement criteria
than did the SAT-Verbal score, which in turn was more predictive of these
variables than was the SAT-Mathematical score. None of these measures, however,
was related to course satisfaction.
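The note to Table 9 states that correlations above .15 differ from zero at p < .01 for samples of 308 and 369. A rough check of that threshold is sketched below. It assumes a two-tailed test and substitutes a normal approximation for the t distribution, which is reasonable only because the samples are large; the report does not say how its cutoff was derived, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def critical_r(n, alpha=0.01):
    """Smallest |r| significant at the given two-tailed alpha for sample size n.

    Uses r = t / sqrt(t^2 + df) with df = n - 2, and approximates the
    critical t by the normal quantile (an assumption suited to large n).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    df = n - 2
    return z / sqrt(z * z + df)

for n in (308, 369):  # sample sizes given in the note to Table 9
    print(n, round(critical_r(n), 3))
```

Under these assumptions both critical values fall just below .15, in line with the single rounded cutoff given for the two courses.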

Table 9

The Relationships between Six Student Attributes
and the Five Major Outcome Variables

                                      Outcome Variables

                  Course          Course          Non-graded      Multiple-
                  Achievement:    Satisfaction:   Reading:        Choice          Essay
Student           Factor Score    Factor Score    Factor Score    Test Score      Test Score
Attributes          A     B         A     B         A     B         A     B         A     B

GPA                .56   .52      -.02  -.02       .16   .25       .48   .45       .28   .24
SAT-V              .46   .42       .00   .02       .27   .26       .45   .41       .26   .18
SAT-M              .29   .23      -.03   .01       .11   .14       .26   .25       .17   .06
Course
  motivation^a     .03   .13       .00   .18       .03   .04       .02   .17      -.09   .08
Class in
  college          .10   .05      -.03  -.05      -.02   .02       .07   .06       .11   .04
Sex                .02   .11      -.08   .06       .04   .05       .08   .05      -.02   .12

Note: Course A: N = 308; Course B: N = 369. All correlations > .15 are
significantly different from zero at p < .01.

^a Self-report of whether the course was selected "primarily to fulfill a college
requirement" (0) or "primarily to gain knowledge of the contents of
the course" (1).

A comparison of Tables 7 and 9 highlights the differential effectiveness
of experimental treatments (Table 7) vs. student attributes (Table 9) in
predicting course outcomes. While neither these treatments nor these attributes
enable one to predict course satisfaction, the case is very different for
indices of course achievement. A substantial proportion (20%-40%) of the
variance in achievement was predictable by student attributes, and virtually
none by the instructional treatments.
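The "proportion of variance" wording can be tied back to the tabled correlations by squaring them. The sketch below does this for the Course Achievement factor score, using the values in Table 9 and, for contrast, a treatment correlation of .01 of the kind reported for the instructional conditions. It treats each predictor singly, which is a simplifying assumption; the report's 20%-40% range may reflect predictors taken in combination.

```python
# Correlations with the Course Achievement factor score, from Table 9
# (Course A, Course B). The 0.01 treatment value is illustrative of the
# near-zero treatment correlations described in the text.
achievement_r = {
    "GPA": (0.56, 0.52),
    "SAT-V": (0.46, 0.42),
    "SAT-M": (0.29, 0.23),
    "treatment (illustrative)": (0.01, 0.01),
}

def variance_explained(r):
    """Proportion of criterion variance associated with one predictor."""
    return r * r

for name, (r_a, r_b) in achievement_r.items():
    print(f"{name:26s} A: {variance_explained(r_a):.2%}  B: {variance_explained(r_b):.2%}")
```

Previous GPA alone accounts for roughly 27% to 31% of the achievement variance, whereas a correlation of .01 corresponds to about one hundredth of one percent.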

Table 10 provides an even more dramatic illustration of the differing
validity of information from various data sources as predictors of the three
criteria of course achievement.

Insert Table 10 about here

Correlations are presented separately for male and for female students. In
general, the course achievement factor scores were slightly more predictable
by all measures than were the multiple-choice test scores, which in turn were
considerably more predictable than the essay test scores, a finding which
conforms to expectations based upon the probable relative reliabilities of
these three criterion indices.

The data sources are ordered from the top to the bottom of Table 10
roughly by their over-all validity, though only a subset of the significant
predictors from each data source are tabled. For the female sample, the best
predictor of the course achievement factor scores was past performance in other
courses (GPA) with an average validity (r) across both courses of .61. The
SAT-Verbal score (r = .44) and the Educational Set Scale (r = .42) were also highly
predictive, followed closely by the female key from the 1956 Revision of the
Survey of Study Habits and Attitudes (r = .36). Two scales from the Strong
Vocational Interest Blank for Men (r = .27) and the Achievement via Independence
scale from the CPI (r = .26), while less valid than the ability measures, were
more predictive than any of the instructional effects, which produced essentially
zero correlations with the achievement criteria. For the male sample, the results
were similar, though all of the personality inventory scales produced more
uniform, and somewhat lower, validities (r = .20) than for the female sample.

Table 10

A Comparison of Different Data Sources as Predictors of Course Achievement

                                           Male Students     Female Students
                                             A       B         A       B       Total

Past Performance           GPA              .59     .52       .54     .51       .54

Aptitude Test Scores       SAT-V            .42     .44*      .50*    .40       .44
                           SAT-M            .34**   .34*      .25*    .23*      .29

Educational Set &          ESS              .18     .23       .36     .34       .28
Study Habits               SSHA             .20*    .21*      .26     .37       .26

Vocational Interests       Psychologist     .22     .18       .29*    .33       .26
(SVIB)                     Economist        .27     .19       .25*    .34*      .26

Inventory Predictors of    CPI-Ai           .21     .26*      .30*    .27       .26
Academic Achievement       SVIB-Ach         .27     .15       .28*    .20       .22

Instructional              L vs. S^a        .01     .05      -.11     .08       .01
Treatments                 Q vs. P^b        .01    -.02       .11    -.09       .00
                           LQ vs. SP^c      .02     .02       .00    -.01       .01

(N)^d                                      (186)   (186)     (195)   (239)     (806)

^a Lecture sections (L) = 0; Self-study sections (S) = 1.

^b Quiz sections (Q) = 0; Paper sections (P) = 1.

^c LQ = 0; LP & SQ = 1; SP = 2.

^d The sample sizes vary slightly from row to row, since not all of the subjects completed
each inventory.

*p < .05

**p < .01

While these comparisons should be instructive for the continuing debate
over the relative contributions of situations (treatments) vs. traits
(attributes) as main effects in applied prediction problems (see Chapter VI),
the focus of the present project is on potential situation-trait interactions.
Consequently, let us turn now to the findings which have some direct bearing
on the interaction hypothesis.



Chapter IV:

ANALYSES OF THE INTERACTION HYPOTHESIS:
THE A PRIORI SCALES

The first tactic for discovering any significant interactions between
course treatments and student attributes involved a broad band-width pass
through already existing personality measures. While the present chapter
reports the findings from such an explicitly empirical sweep through more than 350
a priori variables, the reader must bear in mind that since the number of
personality variables in this study was so large, most, if not all, of the
interactions to be presented could have arisen by chance alone. A comparison between
the number of significant interaction effects uncovered and the number expected
by chance is included at the end of this chapter.
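The caution about chance findings can be made concrete with a small calculation. The sketch below assumes exactly 350 independent tests per criterion, which is a simplification: the text says "more than 350" variables, and personality scales are typically intercorrelated, so the figures are only a rough baseline and not the comparison reported at the end of the chapter. Function names are illustrative.

```python
def expected_chance_hits(n_tests, alpha):
    """Expected number of 'significant' results if every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least_one(n_tests, alpha):
    """Chance of one or more false positives, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_tests = 350  # assumed lower bound on the number of a priori variables
for alpha in (0.05, 0.01):
    hits = expected_chance_hits(n_tests, alpha)
    p_any = prob_at_least_one(n_tests, alpha)
    print(f"alpha={alpha}: expected chance hits={hits:.1f}, P(at least one)={p_any:.3f}")
```

On these assumptions about 17 or 18 interactions per criterion would reach the .05 level by chance alone, and 3 or 4 would reach the .01 level, so any count of significant interactions has to be read against that baseline.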

Since the following material is rather technical, it may appeal more to
the specialist in personality assessment than to the general reader. Consequently,
a brief discussion of the overall organization of the chapter may be useful as a
guide for the latter. The findings based on each personality inventory are
reported in turn. For each inventory, the means and standard deviations of the
scales scored in this project are tabled and discussed. Following the technical
description of the scales, the most significant interactions involving these
measures are presented. For readers interested in only one particular data
source, the inventories are discussed in the following order:

Previous GPA, aptitude test scores, etc.         Tables 11-15    Pages 35-38
California Psychological Inventory               Tables 16-20    Pages 38-41
Survey of Study Habits and Attitudes             Tables 21-22    Pages 41-42
Educational Set Scale                            Tables 23-24    Page 43
Strong Vocational Interest Blank                 Tables 25-30    Pages 43-46
Edwards Personal Preference Schedule             Tables 31-35    Pages 46-48
Gough Adjective Check List                       Tables 36-38    Pages 48-50
Welsh Figure Preference Test                     Tables 39-42    Pages 50-51
Minnesota Multiphasic Personality Inventory      Tables 43-47    Pages 52-54
Composite Personal Reaction Inventory            Tables 48-51    Pages 54-55
Composite Choice Preference Inventory            Tables 52-54    Pages 56-57
Bass Social Acquiescence Scale                   Tables 55-57    Pages 57-58
Reported Behavior Inventory                      Tables 58-62    Pages 58-60
Predicted Peer Ratings                           Tables 63-68    Pages 60-63



As a summary of the findings from all of the inventories, the most
significant interactions with each criterion variable are presented in Tables 70-74
(Pages 65-67).

Previous GPA, Scholastic Aptitude, Sex, Class in College, and Initial 
Course Motivation 

Descriptive statistics for the present sample on previous GPA, scholastic
aptitude, sex, class in college, and initial course motivation are presented in
Table 11.

Insert Table 11 about here

While the sample was rather evenly split between males (46%) and
females (54%), there was a heavy preponderance of sophomores (66%), with some
juniors (23%) and a scattering of seniors (8%) and freshmen (3%). Mean scores
on the Scholastic Aptitude Test were close to the national average (500). Male
students scored slightly higher than females on the mathematical section of the
aptitude test, while females scored slightly higher than males on the verbal
section. The first year grade point average for the female students was slightly
superior to that of their male counterparts.
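The sample composition can be recomputed directly from the frequency counts in Table 11. The sketch below pools the two courses and both sexes; the counts are taken from the table, while the variable names and layout are illustrative.

```python
# Frequency counts from Table 11, in the order:
# (Males Course A, Males Course B, Females Course A, Females Course B)
class_counts = {
    "Freshman": (5, 9, 6, 6),
    "Sophomore": (106, 113, 137, 172),
    "Junior": (49, 45, 45, 49),
    "Senior": (26, 19, 7, 12),
}
group_n = (186, 186, 195, 239)

total_n = sum(group_n)
males = group_n[0] + group_n[1]
females = group_n[2] + group_n[3]

def pct(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)

print("Total N:", total_n)
print("Males:", pct(males, total_n), "%  Females:", pct(females, total_n), "%")
for label, counts in class_counts.items():
    print(label, pct(sum(counts), total_n), "%")
```

The class counts in each column sum to the tabled group N, which provides a quick consistency check on the table itself.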



Table 11

Characteristics of the Sample:
Sex, Class in College, Course Motivation,
Past Academic Performance (GPA), and Scholastic Aptitude

                                      Males             Females
                                    A       B         A       B

Class in College
    Freshman                        5       9         6       6
    Sophomore                     106     113       137     172
    Junior                         49      45        45      49
    Senior                         26      19         7      12

Course Motivation
    "Required"                    108     100       124     135
    "Elective"                     78      86        71     104

    (N)                          (186)   (186)     (195)   (239)

Past Performance &
Scholastic Aptitude
    Mean    GPA                  2.53    2.53      2.63    2.60
            SAT-V                 503     503       519     515
            SAT-M                 534     524       491     470

    S.D.    GPA                   .48     .46       .47     .45
            SAT-V                  88      79        84      85
            SAT-M                  92      92        84      83

    (N)                          (145)   (159)     (163)   (210)
On the Course Evaluation Questionnaire, administered at the end of the
term, students were asked the following question: "My major reason for enrolling
in this course was: (1) primarily to fulfill a college requirement, or (2)
primarily to gain knowledge of the contents of the course." While neither of these
particular courses was specifically required for students at the University of 
Oregon, approximately 58% of the sample indicated that they had elected the
course "primarily to fulfill a college requirement." One such University
requirement makes students complete a year of study in each of three general
areas: Arts and Letters, Social Science, and Natural Science. The Introductory
Psychology sequence, of which the two experimental courses formed a part, could
be used to satisfy either the Social Science or the Natural Science requirement.
Consequently, this measure of initial course motivation should be understood as
reflecting a contrast between an absolute interest in those courses as opposed
to a more limited interest relative to other requirement-satisfying courses.

While neither sex, class in college, nor initial course motivation produced
any significant interaction effects in either course, there were a few
significant interactions involving previous GPA and the Scholastic Aptitude Test scores.
Table 12 presents the correlations between previous grade point average and the
course satisfaction outcome variable for students in different teaching conditions.

Insert Table 12 about here

Note that while there was a slight tendency for GPA to be correlated negatively
with satisfaction in the Lecture (L) condition and positively in the Self-study
(S) condition for the total sample, this effect did not reach statistical
significance when students of each sex were analyzed separately.
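An interaction of this kind amounts to a difference between two correlations obtained in independent groups. The report does not state how the z values in Table 12 were computed, so the following is only a sketch under two assumptions: each course's Lecture vs. Self-study contrast is tested with Fisher's r-to-z transformation, and the two courses are then pooled by dividing the sum of the two z values by the square root of 2. The inputs are the male-sample correlations and Ns given in Table 12 for the Lecture and Self-study conditions with Quiz and Paper sections combined.

```python
from math import atanh, erfc, sqrt

def fisher_z_diff(r1, n1, r2, n2):
    """z test for the difference between two independent correlations."""
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (atanh(r2) - atanh(r1)) / se

def combine_z(z_values):
    """Pool z values from independent replications (sum over root k)."""
    return sum(z_values) / sqrt(len(z_values))

def two_tailed_p(z):
    """Two-tailed normal probability for a z value."""
    return erfc(abs(z) / sqrt(2))

# Male sample, GPA vs. course satisfaction: (r_L, N_L, r_S, N_S) per course.
course_a = (-0.21, 79, 0.09, 66)
course_b = (-0.08, 89, 0.08, 70)

z_a = fisher_z_diff(*course_a)
z_b = fisher_z_diff(*course_b)
z_pooled = combine_z([z_a, z_b])

print(round(z_a, 2), round(z_b, 2), round(z_pooled, 2), round(two_tailed_p(z_pooled), 3))
```

Under these assumptions the pooled value comes out close to the z of 1.95 tabled for the male sample and falls just short of the two-tailed .05 level, which agrees with the statement that the effect was not significant within each sex. The agreement is suggestive rather than proof that this was the procedure used.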
Table 12

The Correlations between Previous Grade Point Average
and the Course Satisfaction Outcome Variable
in Different Teaching Conditions

                    Course A       Course B                Course A       Course B
                    L      S       L      S       z        L      S       L      S       z

TOTAL        Q     ( )    .10     ( )    .07
             P    -.05    ( )    -.08    ( )     2.00     -.12    .07    -.13    .08     2.56

    (N)      Q    (94)   (73)    (81)   (76)
             P    (72)   (69)   (103)  (109)             (166)  (142)   (184)  (185)

MALES        Q     ( )    .10     ( )   -.07
             P    -.25    ( )     .09    ( )     2.17     -.21    .09    -.08    .08     1.95

    (N)      Q    (42)   (40)    (40)   (30)
             P    (37)   (26)    (49)   (40)              (79)   (66)    (89)   (70)

FEMALES      Q     ( )    .11     ( )    .14
             P     .13    ( )    -.20    ( )     1.05     -.07    .09    -.17    .05     1.78

    (N)      Q    (52)   (33)    (41)   (46)
             P    (35)   (43)    (54)   (69)              (87)   (76)    (95)  (115)

Note: Values in the table are correlations between GPA and the Course
Satisfaction Factor Score. Critical comparisons have been circled.

L = Lecture Instruction             Q = Quiz Sections
S = Self-study Instruction          P = Paper Sections

*p < .05
Table 13 focuses on the interactions with the non-graded reading outcome
variable.

Insert Table 13 about here

For female students, previous GPA and SAT-Verbal scores were more
highly related to amount of extracurricular reading in the Quiz (Q) than in
the Paper (P) sections. This effect, though statistically significant in the
female sample, was not very large.

Table 14 presents the interactions with two of the course achievement
outcome variables.

Insert Table 14 about here

The results displayed in Table 14 indicate that, for the total
sample and for the male subsample, there was a slight tendency for SAT-Mathematical
scores to be more highly related to course achievement in the Self-study (S) than
in the Lecture (L) sections. For female students, this effect, while in the
same direction, was not statistically significant.

Finally, Table 15 summarizes one highly significant interaction between
previous grade point average and the essay test score.

Insert Table 15 about here

Note that for male students, though not for female students, previous grade
point average was related to performance on the essay test for students in the
least structured sections (SP) and not so highly related for students in the
most structured sections (LQ). For male students, the correlations between GPA
and the essay test scores were considerably higher in the Paper (P) than in the
Quiz (Q) sections



